This page last changed on Feb 01, 2007 by aaime.

Serving big coverage data sets with good performance requires some knowledge and tuning, since data is usually set up for distribution and archival rather than for serving. This tutorial tries to give you a basic understanding of how data restructuring affects performance, and of how to use the available tools to get optimal data serving performance.

Choose the right format

The first key element is choosing the right format. Some formats are designed for data exchange, others for data rendering and serving. A good data serving format is binary, allows for multi-resolution extraction, and provides support for quick subset extraction at native resolution.
Examples of such formats are GeoTiff, ECW, JPEG 2000 and MrSid. ArcGrid is instead an example of a format that's particularly ill-suited for serving large data sets (it's text based, has no multi-resolution support, and in the general case has to be read fully even to extract a subset of the data).
At the moment we don't have support for wavelet-based compression formats (ECW, JPEG 2000 and MrSid), so your only choice is Geotiff. You can use GDAL, and in particular gdal_translate, to do the conversion work from common formats. This tutorial shows you how to convert an ECW image.

Setup Geotiff data for fast rendering

As soon as your Geotiffs get beyond some tens of megabytes, you'll want to add the following capabilities:

  • inner tiling
  • overviews

Inner tiling sets up the image layout so that it's organized in tiles instead of simple stripes (rows). This allows much quicker access to a given area of the Geotiff, and the Geoserver readers take advantage of this by accessing only the tiles needed to render the current display area. The following sample command instructs gdal_translate to create a tiled Geotiff from an ECW file; the -projwin option additionally restricts the output to the window given by its upper left and lower right corners.

gdal_translate -of GTiff -projwin -180 90 -50 -10 -co "TILED=YES" bigDataSet.ecw mytiff.tif
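To check whether a converted file really ended up tiled, one option is to look at the block size that gdalinfo reports for each band. The sketch below is only an illustration: the function name `block_layout` is made up for this example, and it assumes the usual gdalinfo text output with a `Size is W, H` line and `Block=WxH` band entries. It treats a block as wide as the whole image as a stripe.

```shell
# block_layout: reads the text printed by gdalinfo on stdin and prints
# "striped", "tiled" or "unknown". Hypothetical helper, not part of GDAL.
block_layout() {
    info=$(cat)
    # Image width, taken from the "Size is W, H" line
    width=$(printf '%s\n' "$info" | sed -n 's/^Size is \([0-9]*\), *[0-9]*.*/\1/p' | head -n 1)
    # Block width, taken from the first "Block=WxH" band entry
    block_w=$(printf '%s\n' "$info" | sed -n 's/.*Block=\([0-9]*\)x[0-9]*.*/\1/p' | head -n 1)
    if [ -z "$width" ] || [ -z "$block_w" ]; then
        echo "unknown"
    elif [ "$block_w" = "$width" ]; then
        # Blocks span the full image width: row-based stripes
        echo "striped"
    else
        echo "tiled"
    fi
}
```

It would typically be fed from gdalinfo directly, for example `gdalinfo converted.tif | block_layout`, where `converted.tif` stands for your own file.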

Overviews are downsampled versions of the same image, that is, zoomed out versions, which are usually much smaller. When Geoserver needs to render the Geotiff, it looks for the most appropriate overview as a starting point, thus reading and converting far less data. Overviews can be added using gdaladdo, or the OverviewsEmbedded command included in Geotools. Here is a sample of using gdaladdo to add overviews that are downsampled 2, 4, 8 and 16 times compared to the original:

gdaladdo mytiff.tif 2 4 8 16
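How many overview levels to add depends on the image size. As a rough sketch, the helper below prints power-of-two factors until the larger side of the smallest overview fits within a given number of pixels. The function name `overview_levels` and the 256 pixel default are assumptions made for this example, not a GDAL or Geoserver recommendation.

```shell
# overview_levels WIDTH HEIGHT [MIN_SIZE]
# Prints power-of-two downsampling factors, stopping once the larger side
# of the last overview is at most MIN_SIZE pixels (default 256, an assumption).
overview_levels() {
    width=$1
    height=$2
    min_size=${3:-256}
    # Work on the larger side of the image
    if [ "$width" -gt "$height" ]; then side=$width; else side=$height; fi
    factor=2
    levels=""
    # Add another level while the previous level is still larger than MIN_SIZE
    while [ $((side / (factor / 2))) -gt "$min_size" ]; do
        levels="$levels $factor"
        factor=$((factor * 2))
    done
    # Strip the leading space before printing
    echo "${levels# }"
}
```

The result can be passed straight to gdaladdo, for example `gdaladdo mytiff.tif $(overview_levels 21600 10800)`, with the width and height taken from the `Size is` line of gdalinfo.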

For more hands-on information on how to use the GDAL utilities along with Geoserver, have a look at the BlueMarble data loading tutorial.

As a final note, Geotiff supports various kinds of compression, but we suggest not using it. While it allows for much smaller files, the decompression process is expensive and is performed on each data access, significantly slowing down rendering. In our experience, decompression takes longer than simply reading the uncompressed data from disk.

Handling huge data sets

If you have really huge data sets (several gigabytes), odds are that simply adding overviews and tiles is not enough, and serving intermediate resolutions will still be slow. This is because tiling occurs only at the native resolution level, and the intermediate overviews are too big for quick extraction.

So, what you need is a way to have tiling on intermediate levels as well. This is supported by the ImagePyramid plugin.
This plugin assumes you have created various seamless image mosaics, one for each resolution level of the original image. In each mosaic, tiles are actual files (for more info about mosaics, see the mosaics creation tutorial). The whole pyramid structure looks like the following:

rootDirectory
    +- pyramid.properties
    +- 0
       +- mosaic metadata files
       +- mosaic_file_0.tiff
       +- ...
       +- mosaic_file_n.tiff
    +- ...
    +- 32
       +- mosaic metadata files
       +- mosaic_file_0.tiff
       +- ...
       +- mosaic_file_n.tiff

Creating a pyramid by hand can theoretically be done with GDAL, but in practice it's a daunting task that requires some scripting, since GDAL provides no "tiler" command to extract regular tiles out of an image, nor one to create a downsampled set of tiles. As an alternative, you can use the Geotools PyramidBuilder tool (documentation on how to use it is pending; contact the developers if you need to use it).
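To give an idea of the scripting involved, here is a minimal sketch of the tile-cutting part only. It prints one gdal_translate command per tile, using the -srcwin option (pixel offset and size of the window to extract), and does not execute anything. The function name `tile_commands`, the output naming scheme and the example tile size are assumptions made for this illustration. It does not produce the mosaic metadata files or pyramid.properties, and it does not downsample.

```shell
# tile_commands SRC WIDTH HEIGHT TILE_SIZE OUT_DIR
# Prints the gdal_translate commands that would cut SRC into regular tiles.
# Dry run only: pipe the output to sh to actually run the commands.
tile_commands() {
    src=$1; width=$2; height=$3; tile=$4; out_dir=$5
    y=0
    while [ "$y" -lt "$height" ]; do
        # Clip the last row of tiles to the image border
        h=$((height - y))
        if [ "$h" -gt "$tile" ]; then h=$tile; fi
        x=0
        while [ "$x" -lt "$width" ]; do
            # Clip the last column of tiles to the image border
            w=$((width - x))
            if [ "$w" -gt "$tile" ]; then w=$tile; fi
            echo "gdal_translate -of GTiff -co TILED=YES -srcwin $x $y $w $h $src $out_dir/tile_${x}_${y}.tif"
            x=$((x + tile))
        done
        y=$((y + tile))
    done
}
```

For instance, `tile_commands bigDataSet.tif 21600 10800 2048 tiles` lists the commands for 2048 pixel tiles written to a hypothetical `tiles` directory. A lower resolution level could be approached the same way after first producing a downsampled copy of the source, for example with the -outsize option of gdal_translate.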
